# Assignment 3: Chicago Crimes # MSDS420

**Author:** Atef Bader, PhD **Edit:** 11/20/2020 **Edit:** 12 Sept 2021 sfd - Added extra sample code
- changed out questions

image.png

Deliverables:

  • Submit two files that are labeled: YourLastName_Assignment_3 that have the following formats:

    1. Your HTML that has your Source code and output
    2. Your ipynb script that has your Source code and output

Objectives:

  • Use SQL to execute different queries to retrieve data from Chicago Crime dataset and Police statins dataset
  • Use Geospatial queries to locate police stations and gun related crimes (with arrest or no arrest) in every district on Choropleth map
  • Use Geospatial queries to provide descriptive stat for every district on Choropleth map
  • Use Geospatial queries to locate the Block that is the furthest (Maximum Distance) from the police station that has gun related crime resulted in arrest

Submission Formats :

  • Complete IPYNB script that has the source code in Python used to access and analyze the data. The code should be submitted as an IPYNB script that can be be loaded and run in Jupyter Notebook for Python.
  • Make sure to include your name as part of the IPYNB file name.
  • From the File menu select Download As -> HTML (.html) to create an HTML of the IPYNB file.

Formatting Python Code:

When programming in Python, refer to Kenneth Reitz’ PEP 8: The Style Guide for Python Code: http://pep8.org/ (Links to an external site.)Links to an external site. There is the Google style guide for Python at https://google.github.io/styleguide/pyguide.html (Links to an external site.)Links to an external site. Comment often and in detail.

Descriptions and Requirement Specifications

Chicago Crimes

In his first state of the uniion address , president Trump mentioned Chicago violance 10 times Trump's State of the Union Address

Chicago has more homicides than New York and Los Angeles combined

Columnist Clarence Page wrote an article , published by the Chicago Tribune stated that the city of Chicago had more homicides in the past two years than New York and Los Angeles combined

Chicago Police Department

Chicago police department CPD issues and publishes on daily basis on its website crime alerts, and press releases for the different districts .

image.png

The CPD categoizes the crimes into 8 categories as follows:

image.png

Chicago Crimes Dataset

The CSV file for crimes dataset for the city of Chicago is obtained from the data portal for the city of Chicago. Here is the link for the city of Chicago data portal City of Chicago Data Portal

image.png

Loading the Dataset CSV file

Three set of data are need for this assignment:

  1. The Chicago police stations in every district (hosted on NW server)
  2. The Crimes dataset (hosted on NW server)
  3. The Boundaries.geojson data for district boundries (downloaded in Zip file)

Complete description of the dataset can be found on Chicago city data portal.

Based on Trumps State of the Uniion Address and the article written by columnist Clarence Page and published by the Chicago Tribune, we are interested to retrieve the data for the past two years and perform different types of spatial queries.

There are few of these queries that we are interested in to help CPD and city of Chicago to plot on a Choroplteh map those districts that have highest gun crimes.

Here are examples of those types of queries:

  1. Plot on Choropleth map the districts and their Violent Crimes
  2. Plot on Choropleth map the districts and their Gun related crimes
  3. Which district is the crime capital of Chicago districts?
  4. What the crime density per district?
  5. Plot on Choropleth map those gun related crimes that resulted in arrests
  6. Plot on Choropleth map the gun related crime that is in the farthest Block from the policy stattion for every district

Packages you need to Connect PostgreSQL server to load and retrieve Crhicago Crime dataset from the database:

  1. psycopg2: for PostgreSQL driver
  2. area: to calculate the area inside of any GeoJSON geometry
  3. Folium: for Choropleth maps </font>

Since we are using PostGIS in our work, please read and bookmark Chapter 4. Using PostGIS: Data Management and Queries

In [1]:
# Install a pip package in the current Jupyter kernel
import sys
!{sys.executable} -m pip install psycopg2-binary
Requirement already satisfied: psycopg2-binary in /Users/kagenquiballo/opt/anaconda3/lib/python3.8/site-packages (2.9.1)
In [2]:
# Install a pip package in the current Jupyter kernel
import sys
!{sys.executable} -m pip install area
Requirement already satisfied: area in /Users/kagenquiballo/opt/anaconda3/lib/python3.8/site-packages (1.1.1)
In [3]:
# Install a pip package in the current Jupyter kernel
import sys
!{sys.executable} -m pip install folium
Requirement already satisfied: folium in /Users/kagenquiballo/opt/anaconda3/lib/python3.8/site-packages (0.12.1)
Requirement already satisfied: branca>=0.3.0 in /Users/kagenquiballo/opt/anaconda3/lib/python3.8/site-packages (from folium) (0.4.2)
Requirement already satisfied: jinja2>=2.9 in /Users/kagenquiballo/opt/anaconda3/lib/python3.8/site-packages (from folium) (2.11.2)
Requirement already satisfied: numpy in /Users/kagenquiballo/opt/anaconda3/lib/python3.8/site-packages (from folium) (1.18.5)
Requirement already satisfied: requests in /Users/kagenquiballo/opt/anaconda3/lib/python3.8/site-packages (from folium) (2.24.0)
Requirement already satisfied: MarkupSafe>=0.23 in /Users/kagenquiballo/opt/anaconda3/lib/python3.8/site-packages (from jinja2>=2.9->folium) (1.1.1)
Requirement already satisfied: certifi>=2017.4.17 in /Users/kagenquiballo/opt/anaconda3/lib/python3.8/site-packages (from requests->folium) (2020.6.20)
Requirement already satisfied: chardet<4,>=3.0.2 in /Users/kagenquiballo/opt/anaconda3/lib/python3.8/site-packages (from requests->folium) (3.0.4)
Requirement already satisfied: idna<3,>=2.5 in /Users/kagenquiballo/opt/anaconda3/lib/python3.8/site-packages (from requests->folium) (2.10)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /Users/kagenquiballo/opt/anaconda3/lib/python3.8/site-packages (from requests->folium) (1.25.9)
In [4]:
import folium
from folium import plugins
from folium.plugins import MarkerCluster
import psycopg2
import csv
import pandas as pd
import json
from area import area

from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT 
from IPython.display import display
In [5]:
folium.__version__
Out[5]:
'0.12.1'
In [6]:
psycopg2.__version__
Out[6]:
'2.9.1 (dt dec pq3 ext lo64)'

IMPORTANT NOTES:

  • In this assignment you will be connecting to DSCC PostgreSQL server that has the database and tables already created on the server.
  • You need to use your NetID and password for login and passward to connect PostgreSQL server hosted on DSCC
  • For the psycopg2.connect statements listed below, you must provide your NetID and password in order to connect to PostgreSQL server hosted on DSCC
In [7]:
# Use this data set for your final submission of your Assignment 3
# Uncomment the following line after you unit test your code and ready to run and submit 
#your assignment on this dataset 

# db_connection = psycopg2.connect(host='129.105.248.26', dbname="chicago_crimes", user="YourNetID")


# Use the following dataset for unit testing purposes only.  
# Comment the following line when you are done with your unit testing and ready to run your assignment 
#on the complete dataset and submit your Assignment 

db_connection = psycopg2.connect(host='129.105.248.26', dbname="chicago_crimes_ut", user="kjq978")

cursor = db_connection.cursor()
In [8]:
# Uncomment and run this code when your transactions get "stuck"
#db_connection.rollback()

Chicago Crimes Dataset

The Crimes_2001_to_present.csv is downloaded from Chicago data portal and it has roughly 6.5 million records.

While working on this dataset, It is prudent to make a note of the following:

  1. Geospatial queries are very demanding for system resouces like CPU, Memory, and DISK
  2. We are interested in the data set of the past 2 years, and when you execute Geospatial type queries, please be advised that these queries slow down your machine.
  3. There are two dataset that you will use in this assignment
    • chicago_crimes_ut : Use this mini dataset that has data for rougly a month for coding/debugging and unit testing
    • chicago_crimes : Use this dataset that has data for 2 years for your final submission

Algorithm Performance

  • Sort algorithms used by the database engines vary in performance between O($N log N$) and O($ N^{2} $) where $N$ is the size of the number

  • Search algorithms used by the database engines vary in performance between O($log N$) and O($ N $) where $N$ is the size of the number

What are the column names in the 2 tables that we will use from chicago_crimes database?

In [9]:
# Get the column names for table crimes

cursor.execute("SELECT column_name \
    FROM INFORMATION_SCHEMA.COLUMNS \
    WHERE table_name = 'crimes';")

rows=cursor.fetchall()

rows
Out[9]:
[('id',),
 ('caseno',),
 ('date_of_occurrence',),
 ('block',),
 ('iucr',),
 ('primary_type',),
 ('description',),
 ('location_description',),
 ('arrest',),
 ('domestic',),
 ('beat',),
 ('district',),
 ('ward',),
 ('community_area',),
 ('fbi_cd',),
 ('xcoordinate',),
 ('ycoordinate',),
 ('year',),
 ('updated_on',),
 ('latitude',),
 ('longitude',),
 ('location',),
 ('where_is',)]
In [10]:
# Get the column names for table police_stations

cursor.execute("SELECT column_name \
    FROM INFORMATION_SCHEMA.COLUMNS \
    WHERE table_name = 'police_stations';")

rows=cursor.fetchall()

rows
Out[10]:
[('district',),
 ('district_name',),
 ('address',),
 ('city',),
 ('state',),
 ('zip',),
 ('website',),
 ('phone',),
 ('fax',),
 ('tty',),
 ('x_coordinate',),
 ('y_coordinate',),
 ('latitude',),
 ('longitude',),
 ('location',),
 ('where_is',)]
In [11]:
# Get the first five rows of the police station table
query = "SELECT * \
        FROM police_stations"
stations = pd.read_sql_query(query, db_connection)
stations.head()
Out[11]:
district district_name address city state zip website phone fax tty x_coordinate y_coordinate latitude longitude location where_is
0 1 Central 1718 S State St Chicago IL 60616 http://home.chicagopolice.org/community/distri... 312-745-4290 312-745-3694 312-745-3693 1176569.052 1891771.704 41.858373 -87.627356 (41.8583725929, -87.627356171) 0101000020E61000009AEA2C27DFED444001467E9A26E8...
1 2 Wentworth 5101 S Wentworth Ave Chicago IL 60609 http://home.chicagopolice.org/community/distri... 312-747-8366 312-747-5396 312-747-6656 1175864.837 1871153.753 41.801811 -87.630560 (41.8018110912, -87.6305601801) 0101000020E6100000608FECBEA1E64440A2D015195BE8...
2 3 Grand Crossing 7040 S Cottage Grove Ave Chicago IL 60637 http://home.chicagopolice.org/community/distri... 312-747-8201 312-747-5479 312-747-9168 1182739.183 1858317.732 41.766431 -87.605748 (41.7664308925, -87.6057478606) 0101000020E6100000DB984B681AE244409B14AC92C4E6...
3 4 South Chicago 2255 E 103rd St Chicago IL 60617 http://home.chicagopolice.org/community/distri... 312-747-7581 312-747-5276 312-747-9169 1193131.299 1837090.265 41.707933 -87.568349 (41.7079332906, -87.5683491228) 0101000020E61000005C26DC8E9DDA444010C7FCD45FE4...
4 5 Calumet 727 E 111th St Chicago IL 60628 http://home.chicagopolice.org/community/distri... 312-747-8210 312-747-5935 312-747-9170 1183305.427 1831462.313 41.692723 -87.604506 (41.6927233639, -87.6045058667) 0101000020E61000000830B828ABD84440F6786339B0E6...
In [12]:
# Another way to get the police station table

query = "SELECT * \
        FROM police_stations"
cursor.execute(query)
data = cursor.fetchall()
colnames = [desc[0] for desc in cursor.description]
# create a data frame with column headers
stations = pd.DataFrame(data,columns = colnames).drop_duplicates()

stations
Out[12]:
district district_name address city state zip website phone fax tty x_coordinate y_coordinate latitude longitude location where_is
0 1 Central 1718 S State St Chicago IL 60616 http://home.chicagopolice.org/community/distri... 312-745-4290 312-745-3694 312-745-3693 1176569.052 1891771.704 41.858373 -87.627356 (41.8583725929, -87.627356171) 0101000020E61000009AEA2C27DFED444001467E9A26E8...
1 2 Wentworth 5101 S Wentworth Ave Chicago IL 60609 http://home.chicagopolice.org/community/distri... 312-747-8366 312-747-5396 312-747-6656 1175864.837 1871153.753 41.801811 -87.630560 (41.8018110912, -87.6305601801) 0101000020E6100000608FECBEA1E64440A2D015195BE8...
2 3 Grand Crossing 7040 S Cottage Grove Ave Chicago IL 60637 http://home.chicagopolice.org/community/distri... 312-747-8201 312-747-5479 312-747-9168 1182739.183 1858317.732 41.766431 -87.605748 (41.7664308925, -87.6057478606) 0101000020E6100000DB984B681AE244409B14AC92C4E6...
3 4 South Chicago 2255 E 103rd St Chicago IL 60617 http://home.chicagopolice.org/community/distri... 312-747-7581 312-747-5276 312-747-9169 1193131.299 1837090.265 41.707933 -87.568349 (41.7079332906, -87.5683491228) 0101000020E61000005C26DC8E9DDA444010C7FCD45FE4...
4 5 Calumet 727 E 111th St Chicago IL 60628 http://home.chicagopolice.org/community/distri... 312-747-8210 312-747-5935 312-747-9170 1183305.427 1831462.313 41.692723 -87.604506 (41.6927233639, -87.6045058667) 0101000020E61000000830B828ABD84440F6786339B0E6...
5 6 Gresham 7808 S Halsted St Chicago IL 60620 http://home.chicagopolice.org/community/distri... 312-745-3617 312-745-3649 312-745-3639 1172283.013 1853022.646 41.752137 -87.644229 (41.7521368378, -87.6442289066) 0101000020E610000059F51C0546E044409DE5E40B3BE9...
6 7 Englewood 1438 W 63rd St Chicago IL 60636 http://home.chicagopolice.org/community/distri... 312-747-8223 312-747-6558 312-747-6652 1167659.235 1863005.522 41.779632 -87.660887 (41.7796315359, -87.6608870173) 0101000020E6100000749D5FF7CAE344400F5012F94BEA...
7 8 Chicago Lawn 3420 W 63rd St Chicago IL 60629 http://home.chicagopolice.org/community/distri... 312-747-8730 312-747-8545 312-747-8116 1154575.242 1862672.049 41.778987 -87.708864 (41.778987189, -87.7088638153) 0101000020E6100000C7862CDAB5E34440EE0D5B065EED...
8 9 Deering 3120 S Halsted St Chicago IL 60608 http://home.chicagopolice.org/community/distri... 312-747-8227 312-747-5329 312-747-9172 1171440.240 1884085.224 41.837394 -87.646408 (41.8373944311, -87.6464077068) 0101000020E6100000EE599DBD2FEB4440449571BE5EE9...
9 10 Ogden 3315 W Ogden Ave Chicago IL 60623 http://home.chicagopolice.org/community/distri... 312-747-7511 312-747-7429 312-747-7471 1154500.753 1890985.501 41.856685 -87.708382 (41.8566845327, -87.708381958) 0101000020E610000069ABB3D6A7ED4440B0D1492156ED...
10 11 Harrison 3151 W Harrison St Chicago IL 60612 http://home.chicagopolice.org/community/distri... 312-746-8386 312-746-4281 312-746-5151 1155244.069 1897148.755 41.873582 -87.705488 (41.8735822883, -87.705488126) 0101000020E610000018F5628BD1EF44403D84AFB726ED...
11 12 Near West 1412 S Blue Island Ave\n Chicago IL 60608 http://home.chicagopolice.org/community/distri... 312-746-8396 312-746-4248 312-746-9868 1168487.845 1893384.455 41.862977 -87.656973 (41.8629766244, -87.6569725149) 0101000020E61000006A0E940476EE44408C346DD60BEA...
12 14 Shakespeare 2150 N California Ave Chicago IL 60647 http://home.chicagopolice.org/community/distri... 312-744-8250 312-744-2422 312-744-8260 1157304.426 1914481.521 41.921103 -87.697452 (41.9211033246, -87.6974518223) 0101000020E61000008AD1ADB6E6F54440E25BF50CA3EC...
13 15 Austin 5701 W Madison St Chicago IL 60644 http://home.chicagopolice.org/community/distri... 312-743-1440 312-743-1366 312-743-1485 1138148.815 1899399.078 41.880083 -87.768200 (41.8800834614, -87.768199889) 0101000020E6100000AB392793A6F04440D015DF2F2AF1...
14 16 Jefferson Park 5151 N Milwaukee Ave Chicago IL 60630 http://home.chicagopolice.org/community/distri... 312-742-4480 312-742-4421 312-742-4423 1138480.758 1933660.473 41.974094 -87.766149 (41.9740944511, -87.7661488432) 0101000020E610000089FB7E20AFFC4440C6EA249508F1...
15 17 Albany Park 4650 N Pulaski Rd Chicago IL 60630 http://home.chicagopolice.org/community/distri... 312-742-4410 312-742-5411 312-742-5451 1148843.910 1930801.058 41.966053 -87.728115 (41.9660534171, -87.728114561) 0101000020E6100000638B72A3A7FB444040BCCF6D99EE...
16 18 Near North 1160 N Larrabee St Chicago IL 60610 http://home.chicagopolice.org/community/distri... 312-742-5870 312-742-5771 312-742-5773 1172080.029 1908086.527 41.903242 -87.643352 (41.9032416531, -87.6433521393) 0101000020E61000004A91216C9DF34440254774AE2CE9...
17 19 Town Hall 850 W Addison St Chicago IL 60613 http://home.chicagopolice.org/community/distri... 312-744-8320 312-744-4481 312-744-8011 1169730.744 1924160.317 41.947400 -87.651512 (41.9474004564, -87.651512018) 0101000020E61000002AF5136B44F9444075B6785FB2E9...
18 20 Lincoln 5400 N Lincoln Ave Chicago IL 60625 http://home.chicagopolice.org/community/distri... 312-742-8714 312-742-8803 312-742-8841 1158399.146 1935788.826 41.979550 -87.692845 (41.9795495131, -87.6928445094) 0101000020E6100000A421DBE061FD444071EA7F9057EC...
19 22 Morgan Park 1900 W Monterey Ave Chicago IL 60643 http://home.chicagopolice.org/community/distri... 312-745-0710 312-745-0814 312-745-0569 1165825.476 1830851.333 41.691435 -87.668520 (41.6914347795, -87.6685203937) 0101000020E61000005FB553EF80D8444096F0BE09C9EA...
20 24 Rogers Park 6464 N Clark St Chicago IL 60626 http://home.chicagopolice.org/community/distri... 312-744-5907 312-744-6928 312-744-7603 1164193.588 1943199.401 41.999763 -87.671324 (41.9997634842, -87.6713242922) 0101000020E6100000E62AED3FF8FF4440DEA327FAF6EA...
21 25 Grand Central 5555 W Grand Ave Chicago IL 60639 http://home.chicagopolice.org/community/distri... 312-746-8605 312-746-4353 312-746-8383 1138770.871 1913442.439 41.918609 -87.765574 (41.9186088912, -87.765574479) 0101000020E6100000B42EE2F994F5444024901A2CFFF0...
22 Headquarters Headquarters 3510 S Michigan Ave Chicago IL 60653 http://home.chicagopolice.org 1177731.401 1881697.404 41.830702 -87.623395 (41.8307016873, -87.6233953459) 0101000020E610000016A4D76E54EA44409F2E9CB5E5E7...
In [13]:
# store districts (excluding Headquarters) for later use
districts = stations.query("district != 'Headquarters'").district
districts
Out[13]:
0      1
1      2
2      3
3      4
4      5
5      6
6      7
7      8
8      9
9     10
10    11
11    12
12    14
13    15
14    16
15    17
16    18
17    19
18    20
19    22
20    24
21    25
Name: district, dtype: object
In [14]:
# Display the first 5 rows in the crimes table

query = "SELECT * \
        FROM crimes"
chicago = pd.read_sql_query(query, db_connection)
chicago.head()
Out[14]:
id caseno date_of_occurrence block iucr primary_type description location_description arrest domestic ... community_area fbi_cd xcoordinate ycoordinate year updated_on latitude longitude location where_is
0 24368 JC103046 2019-01-03 14:37:00 050XX S INDIANA AVE 110 HOMICIDE FIRST DEGREE MURDER AUTO False False ... 38 01A 1178455.0 1871575.0 2019 2019-01-10 15:16:00 41.802909 -87.621048 (41.802908524, -87.62104831) 0101000020E6100000368AD5B4C5E64440E92B6941BFE7...
1 24369 JC103046 2019-01-03 14:37:00 050XX S INDIANA AVE 110 HOMICIDE FIRST DEGREE MURDER AUTO False False ... 38 01A 1178455.0 1871575.0 2019 2019-01-10 15:16:00 41.802909 -87.621048 (41.802908524, -87.62104831) 0101000020E6100000368AD5B4C5E64440E92B6941BFE7...
2 24370 JC105864 2019-01-05 16:23:00 012XX N ADA ST 110 HOMICIDE FIRST DEGREE MURDER STREET False False ... 24 01A 1167119.0 1908611.0 2019 2019-01-18 09:37:00 41.904789 -87.661560 (41.904788891, -87.66155984) 0101000020E6100000C52A521FD0F34440684915FF56EA...
3 24372 JC107385 2019-01-06 22:23:00 060XX W 63RD ST 110 HOMICIDE FIRST DEGREE MURDER STREET False False ... 64 01A 1137054.0 1862130.0 2019 2019-01-18 09:37:00 41.777831 -87.773112 (41.77783091, -87.773112387) 0101000020E61000004B2298F68FE34440ACCB63AC7AF1...
4 24373 JC108358 2019-01-07 15:45:00 002XX W 87TH ST 110 HOMICIDE FIRST DEGREE MURDER PARKING LOT True False ... 44 01A 1176442.0 1847223.0 2019 2019-01-18 09:37:00 41.736129 -87.629162 (41.736129473, -87.629162044) 0101000020E6100000CBA28F7D39DE4440D16BDC3044E8...

5 rows × 23 columns

In [15]:
# Another way to get the crimes table

query = "SELECT * \
        FROM crimes"
cursor.execute(query)
data = cursor.fetchall()
colnames = [desc[0] for desc in cursor.description]
# create a data frame with column headers
chicago = pd.DataFrame(data,columns = colnames)
chicago.columns
Out[15]:
Index(['id', 'caseno', 'date_of_occurrence', 'block', 'iucr', 'primary_type',
       'description', 'location_description', 'arrest', 'domestic', 'beat',
       'district', 'ward', 'community_area', 'fbi_cd', 'xcoordinate',
       'ycoordinate', 'year', 'updated_on', 'latitude', 'longitude',
       'location', 'where_is'],
      dtype='object')

DATA DICTIONARY (CRIMES) (See https://www.kaggle.com/currie32/crimes-in-chicago)

id - Unique identifier for the record.

caseno - The Chicago Police Department RD Number (Records Division Number), which is unique to the incident.

date_of_occurence - Date when the incident occurred. this is sometimes a best estimate.

**block - The partially redacted address where the incident occurred, placing it on the same block as the actual address.

iucr - The Illinois Unifrom Crime Reporting code. This is directly linked to the Primary Type and Description. See the list of IUCR codes at https://data.cityofchicago.org/d/c7ck-438e.

primary_type - The primary description of the IUCR code.

description - The secondary description of the IUCR code, a subcategory of the primary description.

location-description - Description of the location where the incident occurred.

arrest - Indicates whether an arrest was made.

domestic - Indicates whether the incident was domestic-related as defined by the Illinois Domestic Violence Act.

beat - Indicates the beat where the incident occurred. A beat is the smallest police geographic area – each beat has a dedicated police beat car. Three to five beats make up a police sector, and three sectors make up a police district. The Chicago Police Department has 22 police districts. See the beats at https://data.cityofchicago.org/d/aerh-rz74.

district - Indicates the police district where the incident occurred. See the districts at https://data.cityofchicago.org/d/fthy-xz3r.

ward - The ward (City Council district) where the incident occurred. See the wards at https://data.cityofchicago.org/d/sp34-6z76.

community_area - Indicates the community area where the incident occurred. Chicago has 77 community areas. See the community areas at https://data.cityofchicago.org/d/cauq-8yn6.

fbi_cd - Indicates the crime classification as outlined in the FBI's National Incident-Based Reporting System (NIBRS). See the Chicago Police Department listing of these classifications at http://gis.chicagopolice.org/clearmap_crime_sums/crime_types.html.

xcoordinate - The x coordinate of the location where the incident occurred in State Plane Illinois East NAD 1983 projection. This location is shifted from the actual location for partial redaction but falls on the same block.

ycoordinate - The y coordinate of the location where the incident occurred in State Plane Illinois East NAD 1983 projection. This location is shifted from the actual location for partial redaction but falls on the same block.

year - Year the incident occurred.

updated_on - Date and time the record was last updated.

latitude - The latitude of the location where the incident occurred. This location is shifted from the actual location for partial redaction but falls on the same block.

longitude - The longitude of the location where the incident occurred. This location is shifted from the actual location for partial redaction but falls on the same block.

location - The location where the incident occurred in a format that allows for creation of maps and other geographic operations on this data portal. This location is shifted from the actual location for partial redaction but falls on the same block.

In [16]:
chicago.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 14436 entries, 0 to 14435
Data columns (total 23 columns):
 #   Column                Non-Null Count  Dtype         
---  ------                --------------  -----         
 0   id                    14436 non-null  int64         
 1   caseno                14436 non-null  object        
 2   date_of_occurrence    14436 non-null  datetime64[ns]
 3   block                 14436 non-null  object        
 4   iucr                  14436 non-null  object        
 5   primary_type          14436 non-null  object        
 6   description           14436 non-null  object        
 7   location_description  14436 non-null  object        
 8   arrest                14436 non-null  bool          
 9   domestic              14436 non-null  bool          
 10  beat                  14436 non-null  int64         
 11  district              14436 non-null  int64         
 12  ward                  14436 non-null  float64       
 13  community_area        14436 non-null  object        
 14  fbi_cd                14436 non-null  object        
 15  xcoordinate           14436 non-null  float64       
 16  ycoordinate           14436 non-null  float64       
 17  year                  14436 non-null  int64         
 18  updated_on            14436 non-null  datetime64[ns]
 19  latitude              14436 non-null  float64       
 20  longitude             14436 non-null  float64       
 21  location              14436 non-null  object        
 22  where_is              14436 non-null  object        
dtypes: bool(2), datetime64[ns](2), float64(5), int64(4), object(10)
memory usage: 2.3+ MB
In [17]:
chicago.head()
Out[17]:
id caseno date_of_occurrence block iucr primary_type description location_description arrest domestic ... community_area fbi_cd xcoordinate ycoordinate year updated_on latitude longitude location where_is
0 24368 JC103046 2019-01-03 14:37:00 050XX S INDIANA AVE 110 HOMICIDE FIRST DEGREE MURDER AUTO False False ... 38 01A 1178455.0 1871575.0 2019 2019-01-10 15:16:00 41.802909 -87.621048 (41.802908524, -87.62104831) 0101000020E6100000368AD5B4C5E64440E92B6941BFE7...
1 24369 JC103046 2019-01-03 14:37:00 050XX S INDIANA AVE 110 HOMICIDE FIRST DEGREE MURDER AUTO False False ... 38 01A 1178455.0 1871575.0 2019 2019-01-10 15:16:00 41.802909 -87.621048 (41.802908524, -87.62104831) 0101000020E6100000368AD5B4C5E64440E92B6941BFE7...
2 24370 JC105864 2019-01-05 16:23:00 012XX N ADA ST 110 HOMICIDE FIRST DEGREE MURDER STREET False False ... 24 01A 1167119.0 1908611.0 2019 2019-01-18 09:37:00 41.904789 -87.661560 (41.904788891, -87.66155984) 0101000020E6100000C52A521FD0F34440684915FF56EA...
3 24372 JC107385 2019-01-06 22:23:00 060XX W 63RD ST 110 HOMICIDE FIRST DEGREE MURDER STREET False False ... 64 01A 1137054.0 1862130.0 2019 2019-01-18 09:37:00 41.777831 -87.773112 (41.77783091, -87.773112387) 0101000020E61000004B2298F68FE34440ACCB63AC7AF1...
4 24373 JC108358 2019-01-07 15:45:00 002XX W 87TH ST 110 HOMICIDE FIRST DEGREE MURDER PARKING LOT True False ... 44 01A 1176442.0 1847223.0 2019 2019-01-18 09:37:00 41.736129 -87.629162 (41.736129473, -87.629162044) 0101000020E6100000CBA28F7D39DE4440D16BDC3044E8...

5 rows × 23 columns

In [18]:
# use value_counts to see the values of the field
chicago['year'].value_counts()
Out[18]:
2019    14436
Name: year, dtype: int64
In [19]:
# what is in the field domestic
chicago['domestic'].value_counts()
Out[19]:
False    11974
True      2462
Name: domestic, dtype: int64
In [20]:
# let's look at how many crimes are domestic by primary type
pd.crosstab(chicago['primary_type'], chicago['domestic'], margins=True)
Out[20]:
domestic False True All
primary_type
ARSON 18 1 19
ASSAULT 774 317 1091
BATTERY 1328 1398 2726
BURGLARY 630 7 637
CONCEALED CARRY LICENSE VIOLATION 7 0 7
CRIM SEXUAL ASSAULT 79 15 94
CRIMINAL DAMAGE 1289 168 1457
CRIMINAL TRESPASS 436 25 461
DECEPTIVE PRACTICE 785 5 790
GAMBLING 3 0 3
HOMICIDE 15 0 15
HUMAN TRAFFICKING 3 0 3
INTERFERENCE WITH PUBLIC OFFICER 81 0 81
INTIMIDATION 12 4 16
KIDNAPPING 4 0 4
LIQUOR LAW VIOLATION 11 0 11
MOTOR VEHICLE THEFT 483 5 488
NARCOTICS 933 0 933
NON-CRIMINAL 1 0 1
OBSCENITY 0 1 1
OFFENSE INVOLVING CHILDREN 50 49 99
OTHER OFFENSE 678 326 1004
PROSTITUTION 58 0 58
PUBLIC PEACE VIOLATION 63 1 64
ROBBERY 513 11 524
SEX OFFENSE 41 3 44
STALKING 7 2 9
THEFT 3300 124 3424
WEAPONS VIOLATION 372 0 372
All 11974 2462 14436
In [21]:
# percentages would be more informative
# ------ lambda expressions section 4.7.6 ('syntactic sugar for a normal function definition')
# https://docs.python.org/3/tutorial/controlflow.html
pd.crosstab(chicago['primary_type'], chicago['domestic']).apply(lambda x: x/x.sum(), axis = 1)
Out[21]:
domestic False True
primary_type
ARSON 0.947368 0.052632
ASSAULT 0.709441 0.290559
BATTERY 0.487161 0.512839
BURGLARY 0.989011 0.010989
CONCEALED CARRY LICENSE VIOLATION 1.000000 0.000000
CRIM SEXUAL ASSAULT 0.840426 0.159574
CRIMINAL DAMAGE 0.884695 0.115305
CRIMINAL TRESPASS 0.945770 0.054230
DECEPTIVE PRACTICE 0.993671 0.006329
GAMBLING 1.000000 0.000000
HOMICIDE 1.000000 0.000000
HUMAN TRAFFICKING 1.000000 0.000000
INTERFERENCE WITH PUBLIC OFFICER 1.000000 0.000000
INTIMIDATION 0.750000 0.250000
KIDNAPPING 1.000000 0.000000
LIQUOR LAW VIOLATION 1.000000 0.000000
MOTOR VEHICLE THEFT 0.989754 0.010246
NARCOTICS 1.000000 0.000000
NON-CRIMINAL 1.000000 0.000000
OBSCENITY 0.000000 1.000000
OFFENSE INVOLVING CHILDREN 0.505051 0.494949
OTHER OFFENSE 0.675299 0.324701
PROSTITUTION 1.000000 0.000000
PUBLIC PEACE VIOLATION 0.984375 0.015625
ROBBERY 0.979008 0.020992
SEX OFFENSE 0.931818 0.068182
STALKING 0.777778 0.222222
THEFT 0.963785 0.036215
WEAPONS VIOLATION 1.000000 0.000000

Lets start executing different Queries

Query #1:

  • Calculate the total number of crimes in every district and plot that on Choropleth map. In other words, create a Choropleth map where the districts are shaded in proportion to the number of crimes in that district.
In [22]:
# Get the total number of crimes per district. 
# Since each row corresponds to a crime, we just need to count the number of rows per district.
cursor.execute("SELECT district, count(district) \
                FROM crimes \
                GROUP BY district")
rows = cursor.fetchall()
In [23]:
# Create a dataframe from the results of the query.
crimes_per_district = pd.DataFrame(rows, columns=['dist_num','number_of_crimes'])
crimes_per_district['dist_num'] = crimes_per_district['dist_num'].astype(str)

crimes_per_district
Out[23]:
dist_num number_of_crimes
0 24 412
1 11 1113
2 8 886
3 19 608
4 25 726
5 4 790
6 14 508
7 3 695
8 17 386
9 20 254
10 22 442
11 9 585
12 10 779
13 7 745
14 1 854
15 5 635
16 18 869
17 2 602
18 16 473
19 15 530
20 6 844
21 12 700
In [24]:
# For each district, excluding HQ, we get the latitude/longitude of the police station at the district.
# With a few exceptions, will be inserting markers on the map at these police station locations
cursor.execute("""SELECT ST_X(ST_AsText(Where_IS)), ST_Y(ST_AsText(Where_IS)), district \
            FROM police_stations \
            WHERE district!='Headquarters'""")
police_stations = cursor.fetchall()

# Create a dataframe with this information.
station_locations = pd.DataFrame(police_stations,columns=['st_latitude','st_longitude','dist_num']).drop_duplicates()

station_locations
Out[24]:
st_latitude st_longitude dist_num
0 41.858373 -87.627356 1
1 41.801811 -87.630560 2
2 41.766431 -87.605748 3
3 41.707933 -87.568349 4
4 41.692723 -87.604506 5
5 41.752137 -87.644229 6
6 41.779632 -87.660887 7
7 41.778987 -87.708864 8
8 41.837394 -87.646408 9
9 41.856685 -87.708382 10
10 41.873582 -87.705488 11
11 41.862977 -87.656973 12
12 41.921103 -87.697452 14
13 41.880083 -87.768200 15
14 41.974094 -87.766149 16
15 41.966053 -87.728115 17
16 41.903242 -87.643352 18
17 41.947400 -87.651512 19
18 41.979550 -87.692845 20
19 41.691435 -87.668520 22
20 41.999763 -87.671324 24
21 41.918609 -87.765574 25
In [25]:
# Add columns for the coordinates of each distract. Again, this is going to be the locations of our markers.
query1_df = pd.merge(crimes_per_district, station_locations,on='dist_num')
query1_df
Out[25]:
dist_num number_of_crimes st_latitude st_longitude
0 24 412 41.999763 -87.671324
1 11 1113 41.873582 -87.705488
2 8 886 41.778987 -87.708864
3 19 608 41.947400 -87.651512
4 25 726 41.918609 -87.765574
5 4 790 41.707933 -87.568349
6 14 508 41.921103 -87.697452
7 3 695 41.766431 -87.605748
8 17 386 41.966053 -87.728115
9 20 254 41.979550 -87.692845
10 22 442 41.691435 -87.668520
11 9 585 41.837394 -87.646408
12 10 779 41.856685 -87.708382
13 7 745 41.779632 -87.660887
14 1 854 41.858373 -87.627356
15 5 635 41.692723 -87.604506
16 18 869 41.903242 -87.643352
17 2 602 41.801811 -87.630560
18 16 473 41.974094 -87.766149
19 15 530 41.880083 -87.768200
20 6 844 41.752137 -87.644229
21 12 700 41.862977 -87.656973

Creating the folium maps

In [26]:
# Create a folium map centered on "downtown Chicago"
total_number_of_crimes_per_district_map = folium.Map(location = (41.8781, -87.6298),zoom_start = 11)
In [27]:
# Create a folium map centered on "downtown Chicago". The boundaries of each distrct are stored in "Boundaries.geojson". 
# Use this data to draw the district regions on the folium map.
# Then shade each region "orange red" with the intensity of the color corresponding to the number of crimes.
# Remember the number of crimes per district is stored in the number_of_crimes column of crimes_per_district.
# We add this "choropleth layer" to our folium map.
folium.Choropleth(geo_data = "Boundaries.geojson", 
              fill_color = 'OrRd', 
              fill_opacity  =0.5, 
              line_opacity= 1,
              data = crimes_per_district,
              key_on = 'feature.properties.dist_num',
              columns = ['dist_num', 'number_of_crimes'],
              legend_name = "CRIMES PER DISTRICT"
              ).add_to(total_number_of_crimes_per_district_map)
Out[27]:
<folium.features.Choropleth at 0x7fc6289e7f10>
In [28]:
# We iterate over the rows in the dataframe creating one marker per row.
# The ("popup") label for each district marker displays the district number and number of crimes.
# Each distric marker is added to the map at the district station's location.
for index, row in query1_df.iterrows():
    # location of police station
    police_station_location = (row['st_latitude'],row['st_longitude'])
    district = row['dist_num']
    num_crimes = row['number_of_crimes']
            
    # Create the marker and add it to the map.
    folium.Marker(location = police_station_location, 
    popup =\
    folium.Popup(html=f"District No : {district} has Total Number of Crimes: {num_crimes}",max_width=450)).\
    add_to(total_number_of_crimes_per_district_map)
  • Lets plot the Choropleth map and notice the intensity of color on the different districts
  • The Blue POPUP represents the location of police station in the different districts in the map
In [29]:
# Display the finished map.
total_number_of_crimes_per_district_map
Out[29]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Query #2:

  • Calculate the total number of violent crimes in every district and plot that in a table on Choropleth map. In other words, create a Choropleth map where the districts are shaded in proportion to the number of violent crimes in that district.

Note: A crime is considered a violent crime if the PRIMARY_TYPE of the crimes is THEFT ,ASSAULT,ROBBER,KIDNAPPING,CRIM SEXUAL ASSAULT,BATTERY, or MURDER.

  • Then find the total number of crimes in the district for each of these primary types of violent crime and add a popup marker (located at that district's police headquarter) that displays a DataFrame containing this data.

violent_crime_df.png

In [30]:
violent_crime_categories = 'THEFT','ASSAULT','ROBBERY','KIDNAPPING','CRIM SEXUAL ASSAULT','BATTERY','MURDER'
In [31]:
# Get the total number of violent  crimes per district. 
cursor.execute("SELECT district, count(district)\
                FROM crimes \
                WHERE PRIMARY_TYPE in %s \
                GROUP BY district",[violent_crime_categories])
rows=cursor.fetchall()
violent_crime_data=pd.DataFrame(rows, columns=['dist_num','number_of_violent_crimes'])
violent_crime_data['dist_num'] = violent_crime_data['dist_num'].astype(str)
violent_crime_data
Out[31]:
dist_num number_of_violent_crimes
0 8 488
1 11 407
2 24 216
3 19 368
4 25 362
5 4 400
6 14 309
7 3 387
8 17 220
9 20 147
10 22 217
11 9 312
12 7 367
13 10 362
14 1 624
15 5 293
16 18 605
17 2 349
18 16 252
19 15 291
20 6 457
21 12 430

Now, lets create a dataframe of the different types of violent crimes for every district

In [32]:
cursor.execute(f"SELECT district, PRIMARY_TYPE, count(PRIMARY_TYPE) \
                    FROM crimes \
                    WHERE PRIMARY_TYPE in {violent_crime_categories} \
                    GROUP BY distrIct,PRIMARY_TYPE")
rows=cursor.fetchall()
violent_crime_type = pd.DataFrame(rows, columns=['dist_num','Description','number_of_violent_crimes'])
violent_crime_type['dist_num'] = violent_crime_type['dist_num'].astype(str)
violent_crime_type
Out[32]:
dist_num Description number_of_violent_crimes
0 24 ROBBERY 15
1 3 CRIM SEXUAL ASSAULT 9
2 8 BATTERY 174
3 12 ASSAULT 38
4 18 THEFT 417
... ... ... ...
108 5 BATTERY 123
109 10 ASSAULT 63
110 18 ROBBERY 33
111 5 ASSAULT 62
112 22 CRIM SEXUAL ASSAULT 2

113 rows × 3 columns

In [33]:
# This time our dataframe will contain one row for each district and each type of violent crime.
# For each distrct number and type of violent crime our dataframe will have the total of violent crimes of that type.
# We once again include the stations location since that is where the markers will appear on the map.
query2_df = pd.merge(violent_crime_type, station_locations,on='dist_num')
query2_df
Out[33]:
dist_num Description number_of_violent_crimes st_latitude st_longitude
0 24 ROBBERY 15 41.999763 -87.671324
1 24 THEFT 96 41.999763 -87.671324
2 24 BATTERY 79 41.999763 -87.671324
3 24 ASSAULT 23 41.999763 -87.671324
4 24 CRIM SEXUAL ASSAULT 3 41.999763 -87.671324
... ... ... ... ... ...
108 14 CRIM SEXUAL ASSAULT 3 41.921103 -87.697452
109 14 ASSAULT 31 41.921103 -87.697452
110 14 ROBBERY 16 41.921103 -87.697452
111 14 BATTERY 48 41.921103 -87.697452
112 14 THEFT 211 41.921103 -87.697452

113 rows × 5 columns

In [34]:
districts
Out[34]:
0      1
1      2
2      3
3      4
4      5
5      6
6      7
7      8
8      9
9     10
10    11
11    12
12    14
13    15
14    16
15    17
16    18
17    19
18    20
19    22
20    24
21    25
Name: district, dtype: object

Create a folium map centered on "downtown Chicago". The boundaries of each distrct are stored in "Boundaries.geojson". Use this data to draw the district regions on the folium map. Then shade each region "orange red" with the intensity of the color corresponding to the number of crimes. Remember the number of crimes per district is stored in the number_of_crimes column of crimes_per_district. We add this "chropleth layer" to our folium map.

Creating the folium maps

In [35]:
# Create a folium map centered on "downtown Chicago". The boundaries of each distrct are stored in "Boundaries.geojson". 
# Use this data to draw the district regions on the folium map.
# Then shade each region "orange red" with the intensity of the color corresponding to the number of crimes.
# Remember the number of crimes per district is stored in the number_of_crimes column of crimes_per_district.
# We add this "choropleth layer" to our folium map
violent_crimes_per_district_map= folium.Map(location =(41.8781, -87.6298),zoom_start=11)
folium.Choropleth(geo_data="Boundaries.geojson", 
              fill_color='YlOrRd', 
              fill_opacity=0.5, 
              line_opacity=1,
              data = violent_crime_data, #data source changed from first example
              key_on='feature.properties.dist_num',
              columns = ['dist_num', 'number_of_violent_crimes'],
              legend_name = "VIOLENT CRIMES PER DISTRICT"
              ).add_to(violent_crimes_per_district_map)
Out[35]:
<folium.features.Choropleth at 0x7fc628db10a0>

In addition, for each district find the block(s) that has the highest number of gun crimes in that district. Note that there might be a tie for the highest number of gun crimes. You need to find all such blocks. Add a popup marker (located at that district's police headquarter) that displays a DataFrame containing all such block along with the number of gun crimes for that block (i.e. the highest number of crimes for a district).

In [36]:
# We iterate over district numbers and get all the rows in query2_df with that district number.
# This new dataframe contains the number of crimes for each of violent crime types in a particular district.
# We convert this dataframe to "html" which will be displayed in marker label (along with the district number)
# Each distric marker is added to the map at the district station's location like before.

#districts = query2_df.dist_num.unique()
for district in districts:
    df2d = query2_df[query2_df.dist_num == district]  # new dataframe to add to marker.

    police_station_location = tuple(df2d[['st_latitude','st_longitude']].values[0])

    violent_crimes_per_district_df = df2d[['Description','number_of_violent_crimes']]
    # Convert out new data frame to html before inserting it into the label.
    header = violent_crimes_per_district_df.to_html(classes=
                    'table table-striped table-hover table-condensed table-responsive')
    # Create the folium marker and add it to the map.
    folium.Marker(location=police_station_location, popup=folium.\
                  Popup(html=f"District Number {district} - Violent Crimes {header}")).\
                  add_to(violent_crimes_per_district_map)

    # Create the marker and add it to the map.
    folium.Marker(location = police_station_location, 
    popup =\
    folium.Popup(html=f"District No : {district} - Violent Crimes {header}",max_width=450)).\
    add_to(violent_crimes_per_district_map)
In [37]:
# Display the finished map.
violent_crimes_per_district_map
Out[37]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Query #3:

  • Calculate the total number of gun related violent crimes in every district and plot that in a table on Choropleth map.

Note: A crime is considered a gun related violent crime if the word "gun" is contained in the DESCRIPTION and the PRIMARY_TYPE of the crimes is THEFT ,ASSAULT,ROBBER,KIDNAPPING,CRIM SEXUAL ASSAULT,BATTERY, or MURDER.

  • Then find the total number of crimes in the district for the different DESCRIPTIONs containing the word "gun" and add a popup marker (located at that district's police headquarter) that displays a DataFrame containing this data.

Lets first create a dataframe of gun crimes per district first to get an idea about the number of gun crimes per district

In [38]:
# isolate violent gun crime
gun='%GUN%'
cursor.execute(f"SELECT district, count(district)\
                FROM crimes\
                WHERE PRIMARY_TYPE in {violent_crime_categories}\
                AND DESCRIPTION::text LIKE '{gun}' GROUP BY district")
districts_violent_gun_crimes = cursor.fetchall()
districts_violent_gun_crimes_df = pd.DataFrame(districts_violent_gun_crimes, columns=['dist_num','violent_gun_crimes'])
districts_violent_gun_crimes_df['dist_num'] = districts_violent_gun_crimes_df['dist_num'].astype(str)
districts_violent_gun_crimes_df.head()
Out[38]:
dist_num violent_gun_crimes
0 24 9
1 8 38
2 11 36
3 19 9
4 25 25

Now, lets create a dataframe of the different types of violent gun crimes for every district

In [39]:
cursor.execute(f"SELECT district, DESCRIPTION, count(DESCRIPTION) \
                FROM crimes \
                WHERE PRIMARY_TYPE in {violent_crime_categories} AND DESCRIPTION::text LIKE '{gun}' \
                GROUP BY district, DESCRIPTION")
rows=cursor.fetchall()
violent_gun_crime_type = pd.DataFrame(rows, columns=['dist_num','Description','number_of_violent_gun_crimes'])
violent_gun_crime_type['dist_num'] = violent_gun_crime_type['dist_num'].astype(str)
violent_gun_crime_type.head()
Out[39]:
dist_num Description number_of_violent_gun_crimes
0 8 ARMED: HANDGUN 23
1 11 AGGRAVATED: HANDGUN 15
2 6 AGGRAVATED: HANDGUN 15
3 20 ATTEMPT: ARMED-HANDGUN 1
4 19 AGGRAVATED: HANDGUN 5
In [40]:
query3_df = pd.merge(violent_gun_crime_type, station_locations, on = 'dist_num')
query3_df.head()
Out[40]:
dist_num Description number_of_violent_gun_crimes st_latitude st_longitude
0 8 ARMED: HANDGUN 23 41.778987 -87.708864
1 8 ATTEMPT: ARMED-HANDGUN 2 41.778987 -87.708864
2 8 AGGRAVATED: HANDGUN 13 41.778987 -87.708864
3 11 AGGRAVATED: HANDGUN 15 41.873582 -87.705488
4 11 ARMED: HANDGUN 19 41.873582 -87.705488

Creating the folium maps

In [41]:
# Create a folium map centered on "downtown Chicago". The boundaries of each distrct are stored in "Boundaries.geojson". 
# Use this data to draw the district regions on the folium map.
# Then shade each region "orange red" with the intensity of the color corresponding to the number of crimes.
# Remember the number of crimes per district is stored in the number_of_crimes column of crimes_per_district.
# We add this "choropleth layer" to our folium map.# Create a folium map centered on "downtown Chicago"
districts_violent_gun_crimes_map = folium.Map(location =(41.8781, -87.6298),zoom_start=11)
folium.Choropleth(geo_data="Boundaries.geojson", 
              fill_color='YlOrRd', 
              fill_opacity=0.5, 
              line_opacity=1,
              data = districts_violent_gun_crimes_df,
              key_on='feature.properties.dist_num',
              columns = ['dist_num', 'violent_gun_crimes'],
              legend_name="VIOLENT GUN CRIME"
              ).add_to(districts_violent_gun_crimes_map)
Out[41]:
<folium.features.Choropleth at 0x7fc62a36bf10>
In [42]:
# districts = query3_df.dist_num.unique()
for district in districts:
    df3d = query3_df[query3_df.dist_num == district]
  
    police_station_location = tuple(df3d[['st_latitude','st_longitude']].values[0])

    violent_gun_crimes_per_district_df = df3d[['Description','number_of_violent_gun_crimes']]
    
    header = violent_gun_crimes_per_district_df.to_html(classes=
                    'table table-striped table-hover table-condensed table-responsive')

    folium.Marker(location=police_station_location, popup=folium.\
                  Popup(html=f"District Number {district} - Violent GUN Crimes {header}")).\
                  add_to(districts_violent_gun_crimes_map)

    folium.Marker(location = police_station_location, 
    popup =\
    folium.Popup(html=f"District No : {district} - Violent Gun Crimes {header}",max_width=450)).\
    add_to(districts_violent_gun_crimes_map)
In [43]:
districts_violent_gun_crimes_map
Out[43]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Query #4:

  • Calculate the crime density per district
Boundaries.geojson: { "type": "FeatureCollection", "features": [ {"type":"Feature","properties":{"dist_num":"17","dist_label":"17TH"}, "geometry": {"type":"MultiPolygon","coordinates": [[[[-87.71067089391354,41.997365655369435], [-87.71066884721016,41.99729359357709], [-87.71066053080999,41.997225765680135], ....
In [44]:
district = []
tarea=[]

with open('Boundaries.geojson') as f:
    data = json.load(f)
    a = data['features'] # a is a list of district data (dictionaries)
    for i in range(len(a)):
        obj = a[i]['geometry'] # list of coordinates defining the ith district boundary
        n = a[i]['properties'] # dictionary with district number and district label keyes for ith district
        district.append(n['dist_num']) # add district number to the district list
        tarea.append(area(obj)/10000) # add the the area (in hectares) to area list

af = pd.DataFrame({'dist_num': district,'district_area_inHectares':tarea})
af['dist_num'] = af['dist_num'].astype(str)
final_data= pd.merge(af, crimes_per_district, on='dist_num', how='inner')
final_data['crime_density'] = round(final_data['number_of_crimes']/(final_data['district_area_inHectares']/100))
final_data
Out[44]:
dist_num district_area_inHectares number_of_crimes crime_density
0 17 2492.727155 386 15.0
1 20 1132.170216 254 22.0
2 19 2225.035732 608 27.0
3 25 2827.989237 726 26.0
4 14 1555.869965 508 33.0
5 7 1688.670732 745 44.0
6 3 1576.063931 695 44.0
7 4 7068.152865 790 11.0
8 6 2099.682124 844 40.0
9 22 3490.416073 442 13.0
10 5 3318.613379 635 19.0
11 24 1406.081387 412 29.0
12 16 8171.776367 473 6.0
13 8 5992.169760 886 15.0
14 18 1215.520046 869 71.0
15 12 2509.453028 700 28.0
16 11 1582.727274 1113 70.0
17 15 989.631393 530 54.0
18 10 2038.988883 779 38.0
19 1 1214.818895 854 70.0
20 9 3505.216898 585 17.0
21 2 1949.690970 602 31.0

Query #5:

  • Create Marker Clusters on Choropleth map for those gun related crimes that resulted in arrest (green icon) and those that didn't (red icon)
In [45]:
# isolate gun crime
gun='%GUN%'
cursor.execute(f"SELECT district, count(district)\
                FROM crimes\
                WHERE DESCRIPTION::text LIKE '{gun}' GROUP BY district")
districts_gun_crimes = cursor.fetchall()
districts_gun_crimes_df = pd.DataFrame(districts_gun_crimes, columns=['dist_num','gun_crimes'])
districts_gun_crimes_df['dist_num'] = districts_gun_crimes_df['dist_num'].astype(str)
districts_gun_crimes_df.head()
Out[45]:
dist_num gun_crimes
0 24 9
1 8 49
2 11 69
3 19 11
4 25 43
In [46]:
# This time our dataframe will contain one row for each district and each block in the district.
# Each row contains information about a particular gun crime including the block in which the crime occured.
# We also save the location of the crime incidendent and whether there was an arrest.
# We will use this information when creating the markers. 
# In particular, markers will be positioned at the crime location instead of police station location.
data= []
for district in districts:
    cursor.execute("""SELECT DISTINCT ON(caseno) caseno, block, DESCRIPTION, count(arrest), arrest,latitude,longitude\
                FROM crimes\
                WHERE district = %s and DESCRIPTION::text LIKE %s \
                GROUP BY caseno, block, DESCRIPTION,arrest, latitude, longitude""",[district,gun])
    results = cursor.fetchall()
    
    # insert the district number as the first element of each result list
    gun_crimes_per_district = [[district]+list(result) for result in results]
    data += (gun_crimes_per_district)

gun_crimes_per_district_df = pd.DataFrame(data, columns=['dist_num','caseno','block',\
'Description','arrest_count', 'arrest', 'latitude', 'longitude'])
gun_crimes_per_district_df['dist_num'] = gun_crimes_per_district_df['dist_num'].astype(str)
gun_crimes_per_district_df.head()
Out[46]:
dist_num caseno block Description arrest_count arrest latitude longitude
0 1 JC105034 004XX W VAN BUREN ST AGGRAVATED: HANDGUN 1 False 41.876794 -87.637817
1 1 JC111601 008XX S PARK TER ATTEMPT: ARMED-HANDGUN 1 False 41.871096 -87.629570
2 1 JC111881 0000X W QUINCY ST ATTEMPT: ARMED-HANDGUN 1 False 41.878802 -87.627917
3 1 JC118978 022XX S MICHIGAN AVE AGGRAVATED: HANDGUN 1 True 41.852284 -87.623790
4 1 JC119606 029XX S DEARBORN ST UNLAWFUL POSS OF HANDGUN 1 True 41.841856 -87.628742

Creating the folium maps

In [47]:
# Create a folium map centered on "downtown Chicago". The boundaries of each distrct are stored in "Boundaries.geojson". 
# Use this data to draw the district regions on the folium map.
# Then shade each region "orange red" with the intensity of the color corresponding to the number of crimes.
# Remember the number of crimes per district is stored in the number_of_crimes column of crimes_per_district.
# We add this "choropleth layer" to our folium map.
gun_crime_arrests_map = folium.Map(location =(41.8781, -87.6298),zoom_start=11)
folium.Choropleth(geo_data="Boundaries.geojson", 
              fill_color='YlOrRd', 
              fill_opacity=0.5, 
              line_opacity=1,
              data = districts_gun_crimes_df, #using data from Query 3
              key_on='feature.properties.dist_num',
              columns = ['dist_num', 'gun_crimes'],
              legend_name="GUN CRIME"
              ).add_to(gun_crime_arrests_map)
marker_cluster = MarkerCluster().add_to(gun_crime_arrests_map)
In [48]:
# We create a marker for each gun crime--"green" if there was an arrest and "red" otherwise.

gun='%GUN%'
for district in districts:
    gun_crimes_per_district_df2 = gun_crimes_per_district_df[gun_crimes_per_district_df.dist_num==district]
    
    for index, row in gun_crimes_per_district_df2.iterrows():
        description = row['Description']
        block = row['block']
        loc = (row['latitude'],row['longitude'])
#         print("District No: %s <br> Description: %s <br> Block: %s" %(district,description,block))
        if row['arrest']==True: 
            folium.Marker(location=loc,popup = folium.Popup(html=\
                f"District No: {district} <br> Description: {description} <br> Block: {block}"),\
                          icon=folium.Icon(color='green', icon='ok-sign'),).add_to(marker_cluster)
#             print(f"District {district} green marker at {loc}")
        else:
            folium.Marker(location=loc, popup = folium.Popup(html=
                 f"District No: {district} <br> Description: {description} <br> Block: {block}"),\
                          icon=folium.Icon(color='red',icon='remove-sign'),).add_to(marker_cluster)
 
In [49]:
gun_crime_arrests_map
Out[49]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Query #6:

  • Plot on Choropleth map the farthest Block that has a gun crime from every police station in every district

Locate the farthest gun crime from the police station in every district. Create a Choropleth map where the districts are shaded in proportion to the number of gun crimes in that district. For each district, find the gun crime that was farthest from police station. Add a pop-up on the Choropleth map to display the district number and the Block where the farthest gun crime occurred. Also add circle marker (of radius 5) at the location of the farthest gun crime.

In [50]:
# This complex query returns for each block in each district the gun crime which occured furthest
# from the police station. We also store this distance together with the police station location.
# Once again we will be positioning the markers at the police station locations.
results= []
for district in districts:
    cursor.execute("""SELECT DISTINCT on (A.block) A.district,A.block, A.where_is,ST_Distance(A.where_is,B.where_is)\
FROM crimes as A, police_stations as B 
WHERE A.district=%s and DESCRIPTION::text LIKE %s and B.district= %s
                         and ST_Distance(A.where_is,B.where_is) 
IN 
    ( SELECT max(dist) 
    FROM 
        (SELECT ST_Distance(A.where_is,B.where_is) as dist 
        FROM crimes as A, police_stations as B 
        WHERE A.district=%s and DESCRIPTION::text LIKE %s and B.district= %s ) as f)""",
               [district,gun,district,district,gun, district])

    farthest_block_gun_crime = list(cursor.fetchall()[0])
    cursor.execute(f"SELECT ST_X(ST_AsText(%s)), ST_Y(ST_AsText(%s))"
                   ,(farthest_block_gun_crime[2],farthest_block_gun_crime[2]))
    farthest_block_gun_crime[2:3] = cursor.fetchall()[0]
    
    results.append(farthest_block_gun_crime)

farthest_gun_crime_df = pd.DataFrame(results,columns=['dist_num','block','crime_lat','crime_long','dist'])
farthest_gun_crime_df['dist_num']= farthest_gun_crime_df['dist_num'].astype(str)
query6_df = pd.merge(farthest_gun_crime_df, station_locations,on='dist_num')
query6_df.head()
 
Out[50]:
dist_num block crime_lat crime_long dist st_latitude st_longitude
0 1 004XX W VAN BUREN ST 41.876794 -87.637817 1171.461889 41.858373 -87.627356
1 2 055XX S EVERETT AVE 41.794425 -87.582519 5365.880774 41.801811 -87.630560
2 3 026XX E 75TH ST 41.759100 -87.560145 5093.598187 41.766431 -87.605748
3 4 101XX S EWING AVE 41.711374 -87.535281 3693.438062 41.707933 -87.568349
4 5 118XX S LAFLIN ST 41.678829 -87.658995 6086.359525 41.692723 -87.604506

Creating the folium maps

In [51]:
# Create a folium map centered on "downtown Chicago". The boundaries of each distrct are stored in "Boundaries.geojson". 
# Use this data to draw the district regions on the folium map.
# Then shade each region "orange red" with the intensity of the color corresponding to the number of crimes.
# Remember the number of crimes per district is stored in the number_of_crimes column of crimes_per_district.
# We add this "choropleth layer" to our folium map.
farthest_block_gun_crime_map = folium.Map(location =(41.8781, -87.6298),zoom_start=11)
folium.Choropleth(geo_data="Boundaries.geojson", 
              fill_color='YlOrRd', 
              fill_opacity=0.5, 
              line_opacity=1,
              data = districts_gun_crimes_df,
              key_on='feature.properties.dist_num',
              columns = ['dist_num', 'gun_crimes'],
              legend_name="GUN CRIME"
              ).add_to(farthest_block_gun_crime_map)

 
Out[51]:
<folium.features.Choropleth at 0x7fc62a0964c0>
In [52]:
# This time we place a circle markers at the crime locations in addition to the "regular" markers at the
# police station locations.

for district in districts:

    data = query6_df.loc[query6_df.dist_num==district]
#    print(data)
#     print(data['block'].values)

    block, crime_lat, crime_long, dist, police_lat, police_long =\
    data['block'].values[0], data['crime_lat'].values[0], data['crime_long'].values[0],\
    data['dist'].values[0], data['st_latitude'].values[0], data['st_longitude'].values[0]
    
    folium.CircleMarker((crime_lat,crime_long),radius=5,color='#ff3187',
                       popup=folium.Popup(html=f"District No.: {district} <br> Block:{block}")).\
                       add_to(farthest_block_gun_crime_map)


    folium.Marker(location=(police_lat,police_long),
                  popup=folium.Popup(\
                 html=f"Police Station <br> District No.: {district} <br> Farthest Gun_Crime Block:{block}")).\
                  add_to(farthest_block_gun_crime_map)

farthest_block_gun_crime_map 
Out[52]:
Make this Notebook Trusted to load map: File -> Trust Notebook
# Requirements ** The HTML document your are submitting must have the source code and the output for the following requirements **

Requirement #1: (10 points)

  • Use SQL to select all rows of data with the following fields from crimes - district, ward, arrest, primary_type and location_description.
  • Create a dataframe called new_df with the selected fields.
  • Make sure you name the columns in the dataframe.
  • Show your new_df using info().
In [53]:
cursor.execute("SELECT district, ward, arrest, primary_type, location_description \
                FROM crimes")
new = cursor.fetchall()
new_df = pd.DataFrame(new, columns=['district', 'ward', 'arrest', 'primary_type', 'location_description'])
new_df['district'] = new_df['district'].astype(str)
new_df.info()
new_df.head()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 14436 entries, 0 to 14435
Data columns (total 5 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   district              14436 non-null  object 
 1   ward                  14436 non-null  float64
 2   arrest                14436 non-null  bool   
 3   primary_type          14436 non-null  object 
 4   location_description  14436 non-null  object 
dtypes: bool(1), float64(1), object(3)
memory usage: 465.3+ KB
Out[53]:
district ward arrest primary_type location_description
0 2 3.0 False HOMICIDE AUTO
1 2 3.0 False HOMICIDE AUTO
2 14 2.0 False HOMICIDE STREET
3 8 13.0 False HOMICIDE STREET
4 6 21.0 True HOMICIDE PARKING LOT

Requirement #2: (10 points)

  • Using new_df, show a crosstab of primary_type with the variable called arrest; Show the results as percentages.
  • Comment on what the results for the crime type of Assault tell us
In [54]:
pd.crosstab(new_df['primary_type'], new_df['arrest']).apply(lambda x: x/x.sum(), axis = 1)
Out[54]:
arrest False True
primary_type
ARSON 1.000000 0.000000
ASSAULT 0.806599 0.193401
BATTERY 0.798239 0.201761
BURGLARY 0.990581 0.009419
CONCEALED CARRY LICENSE VIOLATION 0.000000 1.000000
CRIM SEXUAL ASSAULT 0.957447 0.042553
CRIMINAL DAMAGE 0.952642 0.047358
CRIMINAL TRESPASS 0.407809 0.592191
DECEPTIVE PRACTICE 0.959494 0.040506
GAMBLING 0.000000 1.000000
HOMICIDE 0.866667 0.133333
HUMAN TRAFFICKING 1.000000 0.000000
INTERFERENCE WITH PUBLIC OFFICER 0.049383 0.950617
INTIMIDATION 1.000000 0.000000
KIDNAPPING 1.000000 0.000000
LIQUOR LAW VIOLATION 0.000000 1.000000
MOTOR VEHICLE THEFT 0.959016 0.040984
NARCOTICS 0.000000 1.000000
NON-CRIMINAL 1.000000 0.000000
OBSCENITY 0.000000 1.000000
OFFENSE INVOLVING CHILDREN 0.979798 0.020202
OTHER OFFENSE 0.767928 0.232072
PROSTITUTION 0.000000 1.000000
PUBLIC PEACE VIOLATION 0.265625 0.734375
ROBBERY 0.954198 0.045802
SEX OFFENSE 0.931818 0.068182
STALKING 0.777778 0.222222
THEFT 0.898364 0.101636
WEAPONS VIOLATION 0.322581 0.677419

This indicates that 80.66% of assaults did NOT result in an arrest whereas the remaining 19.34% of assaults DID result in an arrest.

Requirement #3: (10 points)

  • Calculate the gun crimes density in every district. (See Query 4.)
In [55]:
gun='%GUN%'
cursor.execute(f"SELECT district, count(district)\
                FROM crimes\
                WHERE DESCRIPTION::text LIKE '{gun}' GROUP BY district")
districts_gun_crimes = cursor.fetchall()
districts_gun_crimes_df = pd.DataFrame(districts_gun_crimes, columns=['dist_num','gun_crimes'])
districts_gun_crimes_df['dist_num'] = districts_gun_crimes_df['dist_num'].astype(str)
#districts_gun_crimes_df.head()

#--------------------------

district = []
tarea=[]

with open('Boundaries.geojson') as f:
    data = json.load(f)
    a = data['features'] # a is a list of district data (dictionaries)
    for i in range(len(a)):
        obj = a[i]['geometry'] # list of coordinates defining the ith district boundary
        n = a[i]['properties'] # dictionary with district number and district label keyes for ith district
        district.append(n['dist_num']) # add district number to the district list
        tarea.append(area(obj)/10000) # add the the area (in hectares) to area list

af = pd.DataFrame({'dist_num': district,'district_area_inHectares':tarea})
af['dist_num'] = af['dist_num'].astype(str)
final_data= pd.merge(af, districts_gun_crimes_df, on='dist_num', how='inner')
final_data['crime_density'] = round(final_data['gun_crimes']/(final_data['district_area_inHectares']/100))
final_data
Out[55]:
dist_num district_area_inHectares gun_crimes crime_density
0 17 2492.727155 7 0.0
1 20 1132.170216 5 0.0
2 19 2225.035732 11 0.0
3 25 2827.989237 43 2.0
4 14 1555.869965 12 1.0
5 7 1688.670732 67 4.0
6 3 1576.063931 57 4.0
7 4 7068.152865 50 1.0
8 6 2099.682124 63 3.0
9 22 3490.416073 24 1.0
10 5 3318.613379 44 1.0
11 24 1406.081387 9 1.0
12 16 8171.776367 12 0.0
13 8 5992.169760 49 1.0
14 18 1215.520046 14 1.0
15 12 2509.453028 38 2.0
16 11 1582.727274 69 4.0
17 15 989.631393 34 3.0
18 10 2038.988883 58 3.0
19 1 1214.818895 6 0.0
20 9 3505.216898 39 1.0
21 2 1949.690970 34 2.0
For Requirements 4 and 5, consider following the steps illustrated in the Queries:
  • Create the data frame with the appropriate marker data.
  • Create the folium base map.
  • Add the Choropleth layer.
  • Iterate over the data frame you created to place the markers on the folium map.
  • Show the final map.

Requirement #4: (20 points)

  • Locate the Block that has the highest number of gun crimes. The popup on Choropleth map shall display the Block in every district along with the total number of gun crimes for that block
In [56]:
# isolate violent gun crime
gun='%GUN%'
cursor.execute(f"SELECT district, count(district)\
                FROM crimes\
                WHERE DESCRIPTION::text LIKE '{gun}' GROUP BY district")
districts_violent_gun_crimes = cursor.fetchall()
districts_violent_gun_crimes_df = pd.DataFrame(districts_violent_gun_crimes, columns=['dist_num','gun_crimes'])
districts_violent_gun_crimes_df['dist_num'] = districts_violent_gun_crimes_df['dist_num'].astype(str)
districts_violent_gun_crimes_df.head()
Out[56]:
dist_num gun_crimes
0 24 9
1 8 49
2 11 69
3 19 11
4 25 43
In [57]:
cursor.execute(f"SELECT district, block, count(*) \
                FROM crimes \
                WHERE DESCRIPTION::text LIKE '{gun}' \
                GROUP BY district, block")
rows=cursor.fetchall()
violent_gun_crime_type = pd.DataFrame(rows, columns=['dist_num','block','number_of_gun_crimes'])
violent_gun_crime_type['dist_num'] = violent_gun_crime_type['dist_num'].astype(str)

violent_gun_crime_type['count_max'] = violent_gun_crime_type.groupby(['dist_num'])['number_of_gun_crimes'].transform(max)
violent_gun_crime_type = violent_gun_crime_type[violent_gun_crime_type['count_max'] == violent_gun_crime_type['number_of_gun_crimes']]

violent_gun_crime_type.sort_values(by=['count_max'], ascending=False).head()
Out[57]:
dist_num block number_of_gun_crimes count_max
568 10 015XX S KOLIN AVE 3 3
692 6 078XX S ASHLAND AVE 3 3
562 3 071XX S WABASH AVE 3 3
464 11 009XX N HAMLIN AVE 3 3
315 12 013XX W 18TH ST 2 2

The 4 blocks with the most gun crimes each have 3 gun crimes.

In [58]:
query3_df = pd.merge(violent_gun_crime_type, station_locations, on = 'dist_num')
query3_df.head()
Out[58]:
dist_num block number_of_gun_crimes count_max st_latitude st_longitude
0 24 072XX N BELL AVE 1 1 41.999763 -87.671324
1 24 068XX N WAYNE AVE 1 1 41.999763 -87.671324
2 24 073XX N DAMEN AVE 1 1 41.999763 -87.671324
3 24 022XX W DEVON AVE 1 1 41.999763 -87.671324
4 24 074XX N WESTERN AVE 1 1 41.999763 -87.671324
In [59]:
# Create a folium map centered on "downtown Chicago". The boundaries of each distrct are stored in "Boundaries.geojson". 
# Use this data to draw the district regions on the folium map.
# Then shade each region "orange red" with the intensity of the color corresponding to the number of crimes.
# Remember the number of crimes per district is stored in the number_of_crimes column of crimes_per_district.
# We add this "choropleth layer" to our folium map.# Create a folium map centered on "downtown Chicago"
districts_violent_gun_crimes_map = folium.Map(location =(41.8781, -87.6298),zoom_start=11)
folium.Choropleth(geo_data="Boundaries.geojson", 
              fill_color='YlOrRd', 
              fill_opacity=0.5, 
              line_opacity=1,
              data = violent_gun_crime_type,
              key_on='feature.properties.dist_num',
              columns = ['dist_num', 'number_of_gun_crimes'],
              legend_name="GUN CRIME PER DISTRICT"
              ).add_to(districts_violent_gun_crimes_map)
Out[59]:
<folium.features.Choropleth at 0x7fc625dd8850>
In [60]:
# districts = query3_df.dist_num.unique()
for district in districts:
    df3d = query3_df[query3_df.dist_num == district]
  
    police_station_location = tuple(df3d[['st_latitude','st_longitude']].values[0])

    violent_gun_crimes_per_district_df = df3d[['block','number_of_gun_crimes']]
    
    header = violent_gun_crimes_per_district_df.to_html(classes=
                    'table table-striped table-hover table-condensed table-responsive')

    folium.Marker(location = police_station_location, 
    popup =\
    folium.Popup(html=f"District No : {district} - Block(s) with the Most Gun Crimes {header}",max_width=450)).\
    add_to(districts_violent_gun_crimes_map)
In [61]:
districts_violent_gun_crimes_map
Out[61]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Requirement #5: (20 points)

  • Create Marker Clusters on Choropleth map for those gun related crimes that have Location Description as RESIDENCE in (green icon) and those that have Location Description as STREET in (red icon) (See Query 5.)
In [62]:
# isolate gun crime
gun='%GUN%'
cursor.execute(f"SELECT district, count(district)\
                FROM crimes\
                WHERE DESCRIPTION::text LIKE '{gun}' GROUP BY district")
districts_gun_crimes = cursor.fetchall()
districts_gun_crimes_df = pd.DataFrame(districts_gun_crimes, columns=['dist_num','gun_crimes'])
districts_gun_crimes_df['dist_num'] = districts_gun_crimes_df['dist_num'].astype(str)
districts_gun_crimes_df.head()
Out[62]:
dist_num gun_crimes
0 24 9
1 8 49
2 11 69
3 19 11
4 25 43
In [63]:
# This time our dataframe will contain one row for each district and each block in the district.
# Each row contains information about a particular gun crime including the block in which the crime occured.
# We also save the location of the crime incidendent and whether there was an arrest.
# We will use this information when creating the markers. 
# In particular, markers will be positioned at the crime location instead of police station location.
data= []
for district in districts:
    cursor.execute("""SELECT DISTINCT ON(caseno) caseno, block, DESCRIPTION, count(location_description), location_description,latitude,longitude\
                FROM crimes\
                WHERE district = %s and DESCRIPTION::text LIKE %s \
                GROUP BY caseno, block, DESCRIPTION,location_description, latitude, longitude""",[district,gun])
    results = cursor.fetchall()
    
    # insert the district number as the first element of each result list
    gun_crimes_per_district = [[district]+list(result) for result in results]
    data += (gun_crimes_per_district)

gun_crimes_per_district_df = pd.DataFrame(data, columns=['dist_num','caseno','block',\
'Description','location_description_COUNT', 'location_description', 'latitude', 'longitude'])
gun_crimes_per_district_df['dist_num'] = gun_crimes_per_district_df['dist_num'].astype(str)
gun_crimes_per_district_df.head()
Out[63]:
dist_num caseno block Description location_description_COUNT location_description latitude longitude
0 1 JC105034 004XX W VAN BUREN ST AGGRAVATED: HANDGUN 1 STREET 41.876794 -87.637817
1 1 JC111601 008XX S PARK TER ATTEMPT: ARMED-HANDGUN 1 PARK PROPERTY 41.871096 -87.629570
2 1 JC111881 0000X W QUINCY ST ATTEMPT: ARMED-HANDGUN 1 RESTAURANT 41.878802 -87.627917
3 1 JC118978 022XX S MICHIGAN AVE AGGRAVATED: HANDGUN 1 RESTAURANT 41.852284 -87.623790
4 1 JC119606 029XX S DEARBORN ST UNLAWFUL POSS OF HANDGUN 1 CHA PARKING LOT/GROUNDS 41.841856 -87.628742
In [64]:
# Create a folium map centered on "downtown Chicago". The boundaries of each distrct are stored in "Boundaries.geojson". 
# Use this data to draw the district regions on the folium map.
# Then shade each region "orange red" with the intensity of the color corresponding to the number of crimes.
# Remember the number of crimes per district is stored in the number_of_crimes column of crimes_per_district.
# We add this "choropleth layer" to our folium map.
gun_crime_arrests_map = folium.Map(location =(41.8781, -87.6298),zoom_start=11)
folium.Choropleth(geo_data="Boundaries.geojson", 
              fill_color='YlOrRd', 
              fill_opacity=0.5, 
              line_opacity=1,
              data = districts_gun_crimes_df, #using data from Query 3
              key_on='feature.properties.dist_num',
              columns = ['dist_num', 'gun_crimes'],
              legend_name="GUN CRIME"
              ).add_to(gun_crime_arrests_map)
marker_cluster = MarkerCluster().add_to(gun_crime_arrests_map)
In [65]:
# We create a marker for each gun crime--"green" if there was an RESIDENCE and "red" STREET.

gun='%GUN%'
for district in districts:
    gun_crimes_per_district_df2 = gun_crimes_per_district_df[gun_crimes_per_district_df.dist_num==district]
    
    for index, row in gun_crimes_per_district_df2.iterrows():
        description = row['Description']
        block = row['block']
        x = row['location_description']
        loc = (row['latitude'],row['longitude'])
#         print("District No: %s <br> Description: %s <br> Block: %s" %(district,description,block))
        if row['location_description']=="RESIDENCE": 
            folium.Marker(location=loc,popup = folium.Popup(html=\
                f"District No: {district} <br> Description: {description} <br> Block: {block} <br> Location: {x}"),\
                          icon=folium.Icon(color='green', icon='ok-sign'),).add_to(marker_cluster)
#             print(f"District {district} green marker at {loc}")
        elif row['location_description']=="STREET":
            folium.Marker(location=loc, popup = folium.Popup(html=
                 f"District No: {district} <br> Description: {description} <br> Block: {block} <br> Location: {x}"),\
                          icon=folium.Icon(color='red',icon='remove-sign'),).add_to(marker_cluster)
        else: pass
 
In [66]:
gun_crime_arrests_map
Out[66]:
Make this Notebook Trusted to load map: File -> Trust Notebook